Add --thinking flag to orchestrator for reasoning depth control #40
Open
ScuttleBot wants to merge 1 commit into
Passes --thinking through to Vultr instances via benchmark_thinking.txt, which bench_runner.sh reads and passes to benchmark.py. Example: uv run orchestrate_vultr.py --models model1 --thinking medium
Code Review Summary
Status: No Issues Found | Recommendation: Merge
Solid implementation. The file-based handoff pattern is consistent with existing
Files Reviewed (2 files)
Reviewed by claude-4.6-sonnet-20260217 · 97,911 tokens
Summary
Adds a --thinking flag to orchestrate_vultr.py so users can set reasoning depth when benchmarking models that support it (e.g., mercury-2).
How it works
- --thinking medium writes the level to /root/benchmark_thinking.txt on each Vultr instance
- bench_runner.sh reads the file and passes --thinking <level> to benchmark.py
- benchmark.py accepts --thinking and passes it through to OpenClaw
Usage
uv run orchestrate_vultr.py --models model1 --thinking medium
Valid levels
off, minimal, low, medium, high, xhigh, adaptive
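One way the orchestrator could reject anything outside the levels listed above is an argparse `choices` constraint. This is a minimal sketch, not the code in this PR: `THINKING_LEVELS` and `build_parser` are hypothetical names, and the way `--models` is parsed here is an assumption.

```python
import argparse

# The levels listed in this PR description.
THINKING_LEVELS = ["off", "minimal", "low", "medium", "high", "xhigh", "adaptive"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; the real orchestrate_vultr.py may differ."""
    parser = argparse.ArgumentParser(prog="orchestrate_vultr.py")
    # Assumption: --models takes one or more model names.
    parser.add_argument("--models", nargs="+", required=True)
    # default=None means "flag omitted", so the model default applies.
    parser.add_argument("--thinking", choices=THINKING_LEVELS, default=None)
    return parser


args = build_parser().parse_args(["--models", "model1", "--thinking", "medium"])
print(args.thinking)
```

With `choices` set, an unknown level makes argparse exit with a usage error before any instance is provisioned.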
Notes
- The benchmark.py script already had --thinking support; this just wires it through the orchestrator
- If --thinking is omitted, behavior is unchanged (uses model default)
- --no-fail-fast and --official-key
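The file-based handoff described under "How it works" can be sketched as a writer and a reader. This is an illustration under stated assumptions rather than the PR's code: the real reader is bench_runner.sh (a shell script), the real file is /root/benchmark_thinking.txt on the remote instance, and `write_thinking_file` and `thinking_args` are hypothetical helper names. A temporary directory stands in for the instance filesystem.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def write_thinking_file(level: Optional[str], path: Path) -> None:
    """Writer side (orchestrator): only create the file when a level was given."""
    if level is not None:
        path.write_text(level + "\n")


def thinking_args(path: Path) -> List[str]:
    """Reader side (runner): turn the file into extra benchmark.py arguments."""
    if not path.exists():
        return []  # flag omitted: behavior unchanged, model default applies
    level = path.read_text().strip()
    return ["--thinking", level] if level else []


with tempfile.TemporaryDirectory() as tmp:
    handoff = Path(tmp) / "benchmark_thinking.txt"
    print(thinking_args(handoff))          # no file written yet
    write_thinking_file("medium", handoff)
    print(thinking_args(handoff))          # level picked up from the file
```

Treating a missing or empty file as "no extra arguments" is what keeps runs without --thinking identical to the previous behavior.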